16 research outputs found

    Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

    Full text link
    Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and to use separate shortcut connections to model the linear dependencies instead. We continue that work, firstly by introducing a third transformation that normalizes the scale of the outputs of each hidden neuron, and secondly by analyzing the connections to second-order optimization methods. We show, both in theory and in experiments, that the transformations make simple stochastic gradient descent behave more like second-order optimization methods and thus speed up learning. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, where both the inputs and outputs of many hidden neurons are close to zero.
    Comment: 10 pages, 5 figures, ICLR201
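    The transformations described in the abstract can be sketched in a few lines. Below is a minimal NumPy illustration of one plausible reading of that description, not the authors' code: for a batch of pre-activations, each unit's nonlinearity is shifted and tilted so that its output and slope are zero on average, and, per the third transformation, rescaled to unit output variance. The function name and the choice of tanh are assumptions.

```python
import numpy as np

# Sketch of the three per-unit transformations (assumed interpretation):
# shift and tilt each nonlinearity so its output and slope are zero on
# average over the batch, then rescale to unit output variance.

def transform_activations(x, f=np.tanh, df=lambda z: 1.0 - np.tanh(z) ** 2):
    """x: (batch, units) pre-activations; returns transformed outputs."""
    alpha = df(x).mean(axis=0)              # zero average slope
    beta = (f(x) - alpha * x).mean(axis=0)  # zero average output
    y = f(x) - alpha * x - beta
    gamma = 1.0 / (y.std(axis=0) + 1e-8)    # third transform: unit scale
    return gamma * y

x = np.random.randn(256, 32)  # 256 examples, 32 hidden units
y = transform_activations(x)
print(y.mean(axis=0).max(), y.std(axis=0).mean())  # mean ~0, scale ~1
```

    The linear component removed here (alpha * x + beta) is exactly what the separate shortcut connections would model instead.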

    Deep Learning of Representations: Looking Forward

    Full text link
    Deep learning research aims at discovering learning algorithms that discover multiple levels of distributed representations, with higher levels representing more abstract concepts. Although the study of deep learning has already led to impressive theoretical results, learning algorithms and breakthrough experiments, several challenges lie ahead. This paper proposes to examine some of these challenges, centering on the questions of scaling deep learning algorithms to much larger models and datasets, reducing optimization difficulties due to ill-conditioning or local minima, designing more efficient and powerful inference and sampling procedures, and learning to disentangle the factors of variation underlying the observed data. It also proposes a few forward-looking research directions aimed at overcoming these challenges.

    Accelerating Evolutionary Algorithms With Gaussian Process Fitness Function Models

    No full text

    Safe transient operation of microgrids based on master-slave configuration

    No full text
    The master-slave configuration is a suitable alternative to the droop control method used in microgrids. In this configuration, only one inverter is the master, while the others are slaves. The slave inverters are always current-controlled, whereas the master inverter should have two selectable operation modes: current-controlled, when the microgrid is connected to the grid; and voltage-controlled, when it is operating in island mode. In grid-connected mode, the master needs a synchronization system to perform accurate control of its delivered power, and, in island mode, it needs a voltage reference oscillator that serves as a reference to the slave inverters. Based on the master-slave concept, this paper proposes a single system that performs both functions, i.e., it can act as a synchronization system or as a voltage reference oscillator depending on an input selector. Moreover, the system ensures a smooth transition between the two operation modes, guaranteeing the safe operation of the microgrid. Experimental results are provided to confirm the effectiveness of the proposed system.
    Peer Reviewed
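    As a rough, hedged illustration of the mode-switching idea (not the paper's design), the sketch below keeps a single continuously integrated phase that either tracks the grid through a simple phase detector in grid-connected mode or free-runs at nominal frequency in island mode. The 50 Hz nominal frequency, the proportional gain, and the first-order loop are all assumptions; the point is that the phase accumulates through the switch, so toggling the selector cannot cause a phase jump.

```python
import math

NOMINAL_W = 2 * math.pi * 50.0  # assumed 50 Hz nominal angular frequency

def step_phase(theta, grid_theta, island_mode, dt, kp=20.0):
    """Advance the shared phase reference one step (kp is an illustrative gain)."""
    if island_mode:
        w = NOMINAL_W                       # voltage-reference oscillator
    else:
        err = math.sin(grid_theta - theta)  # phase detector against the grid
        w = NOMINAL_W + kp * err            # simple first-order PLL
    # The phase is integrated in both modes, so the transition is bumpless.
    return (theta + w * dt) % (2 * math.pi)
```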

    Scalable Neural Networks for Board Games

    No full text
    Learning to solve small instances of a problem should help in solving large instances. Unfortunately, most neural network architectures do not exhibit this form of scalability. Our Multi-Dimensional Recurrent LSTM Networks, however, show a high degree of scalability, as we demonstrate empirically in the domain of flexible-size board games. This allows them to be trained from scratch up to the level of human beginners, without using domain knowledge.
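    A hedged sketch of why such architectures scale: the same weights are applied at every board cell, with the hidden state at (i, j) depending on the input there and on the states of the already-visited neighbours. The plain two-dimensional recurrent scan below is a simplified stand-in without LSTM gates; all names and dimensions are illustrative.

```python
import numpy as np

def md_rnn_scan(board, W_in, W_up, W_left):
    """board: (rows, cols, d_in); returns hidden states (rows, cols, d_h)."""
    rows, cols = board.shape[:2]
    h = np.zeros((rows, cols, W_in.shape[1]))
    for i in range(rows):
        for j in range(cols):
            up = h[i - 1, j] @ W_up if i > 0 else 0.0
            left = h[i, j - 1] @ W_left if j > 0 else 0.0
            h[i, j] = np.tanh(board[i, j] @ W_in + up + left)
    return h

d_in, d_h = 3, 16
params = [np.random.randn(d_in, d_h) * 0.1,
          np.random.randn(d_h, d_h) * 0.1,
          np.random.randn(d_h, d_h) * 0.1]
# The same parameters process a 7x7 and a 19x19 board unchanged.
print(md_rnn_scan(np.random.randn(7, 7, d_in), *params).shape)    # (7, 7, 16)
print(md_rnn_scan(np.random.randn(19, 19, d_in), *params).shape)  # (19, 19, 16)
```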

    Towards Adjusting Mobile Devices to User’s Behaviour

    No full text
    Mobile devices are a special class of resource-constrained embedded devices. Computing power, memory, the available energy, and network bandwidth are often severely limited. These constrained resources require extensive optimization of a mobile system compared to larger systems. Any needless operation has to be avoided. Time-consuming operations have to be started early on; for instance, loading files ideally starts before the user wants to access the file. So-called prefetching strategies optimize the system's operation. Our goal is to adjust such strategies on the basis of logged system data. Optimization is then achieved by predicting an application's behaviour based on facts learned from earlier runs on the same system. In this paper, we analyze system calls at the operating-system level and compare two paradigms, namely server-based and device-based learning. The results could be used to optimize the runtime behaviour of mobile devices.
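    As a toy illustration of learning a prefetching hint from logged system calls (not the paper's method), one could count which file tends to be opened after which and prefetch the most frequent successor. The log format and function names below are assumptions.

```python
from collections import Counter, defaultdict

def train(open_log):
    """open_log: chronological file paths from logged open() system calls."""
    successors = defaultdict(Counter)
    for cur, nxt in zip(open_log, open_log[1:]):
        successors[cur][nxt] += 1
    return successors

def predict_next(successors, current_file):
    """Return the most frequent successor, or None without evidence."""
    if current_file not in successors:
        return None
    return successors[current_file].most_common(1)[0][0]

log = ["contacts.db", "avatar.png", "contacts.db", "avatar.png", "mail.idx"]
model = train(log)
print(predict_next(model, "contacts.db"))  # avatar.png
```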